Skip to content

Conversation

rly
Copy link
Contributor

@rly rly commented Aug 20, 2024

Proposed changes to #1171.

It would be nice if HDF5IO and the ObjectMapper can accept any Zarr array just like it accepts any Numpy array, instead of accepting only Zarr arrays that have been written with hdmf-zarr.

@rly rly requested a review from oruebel August 20, 2024 08:27
@rly rly marked this pull request as draft August 20, 2024 08:27
Copy link

codecov bot commented Aug 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.91%. Comparing base (2b90d21) to head (bb3de8b).

Additional details and impacted files
@@                   Coverage Diff                   @@
##           fix/zarr_str_export    #1175      +/-   ##
=======================================================
+ Coverage                88.83%   88.91%   +0.08%     
=======================================================
  Files                       45       45              
  Lines                     9841     9846       +5     
  Branches                  2798     2797       -1     
=======================================================
+ Hits                      8742     8755      +13     
+ Misses                     779      775       -4     
+ Partials                   320      316       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

# NOTE: Numpy < 2.0 has only fixed-length strings.
# Numpy 2.0 introduces variable-length strings (dtype=np.dtypes.StringDType()).
# HDMF does not yet do any special handling of numpy arrays with variable-length strings.
if isinstance(value, np.ndarray) or _is_zarr_array(value):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The basic approach seems fine. A couple of thoughts:

  • I think instead of the if isinstance(value, np.ndarray) or _is_zarr_array(value) it would be useful to split this case into if and elif cases to treat numpy and zarr separately here. I think that would be more readable.
  • I'd move the _is_zarr_array method ot the utils or data_utils
  • Alternatively to this approach we could also use hasattr(value, filters) to distinguish been Zarr and numpy arrays, because numpy arrays don't have a filters attribute which is used in Zarr to store all the filters, including the encoder for strings.

@rly rly added this to the 4.1.0 milestone Apr 3, 2025
@rly rly modified the milestones: 4.1.0, 4.2.0 May 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants